
    Optimistic chordal coloring: a coalescing heuristic for SSA-form programs

    The interference graph for a procedure in Static Single Assignment (SSA) form is chordal. Since the k-colorability problem can be solved in polynomial time for chordal graphs, this result has generated interest in SSA-based heuristics for spilling and coalescing. Since copies can be folded during SSA construction, instances of the coalescing problem under SSA have fewer affinities than those arising in traditional approaches. This paper presents Optimistic Chordal Coloring (OCC), a coalescing heuristic for chordal graphs. OCC was evaluated on interference graphs from embedded/multimedia benchmarks: in all cases, OCC found the optimal solution, and ran, on average, 2.30× faster than Iterated Register Coalescing.
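
    As a rough illustration of why chordality matters here (this is a textbook property, not the paper's OCC heuristic): a chordal graph can be colored optimally by a greedy pass over a maximum-cardinality-search ordering, which is what makes k-colorability polynomial for SSA interference graphs. A minimal Python sketch, assuming a plain adjacency-set representation of the graph:

    # Sketch only: optimal greedy coloring of a chordal graph via maximum
    # cardinality search (MCS). This illustrates the chordality result the
    # abstract relies on; it is not the OCC coalescing heuristic itself.
    def mcs_order(graph):                      # graph: vertex -> set of neighbors
        weight = {v: 0 for v in graph}
        order = []
        while weight:
            v = max(weight, key=weight.get)    # vertex with most already-ordered neighbors
            order.append(v)
            del weight[v]
            for u in graph[v]:
                if u in weight:
                    weight[u] += 1
        return order                           # the reverse of this order is a perfect elimination order

    def greedy_color(graph, order):
        color = {}
        for v in order:                        # greedy coloring in MCS order is optimal on chordal graphs
            used = {color[u] for u in graph[v] if u in color}
            c = 0
            while c in used:
                c += 1
            color[v] = c
        return color

    # Example: a 4-cycle with one chord is chordal and needs exactly 3 colors.
    g = {'a': {'b', 'c', 'd'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'a', 'c'}}
    print(greedy_color(g, mcs_order(g)))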

    Optimized Memory Access For Dynamically Scheduled High Level Synthesis

    Dynamically scheduled elastic circuits generated by High-Level Synthesis (HLS) tools are inherently out-of-order, following the flow of data rather than the evolution of an instruction pointer. Components of the circuit that access memory need to be connected to a Load-Store Queue (LSQ) that dynamically checks for memory dependencies, performs store ordering and forwarding, and allows unordered access to Random-Access Memory (RAM) whenever possible. While connecting every memory-access (load/store) component to an LSQ ensures correct program execution, the hardware and power cost makes this solution unattractive. Statically ruling out dependencies allows circuits to access memory via lightweight components that use an arbitrator to handle RAM port sharing. Reducing the number of components using the LSQ allows the compiler to generate smaller queues, which results in superlinear savings in hardware and power for the memory subsystem. This work describes additions to the Elastic Compiler (EC) that allow it to analyze algorithms expressed in LLVM-IR, an intermediate code representation, to rule out memory dependencies between load/store instructions, and presents the insights underlying these analyses. The analyses leverage pointer analysis as well as array access patterns to narrow down the list of possibly dependent instructions. We also enhance the compiler to leverage our analyses and automatically generate the appropriate memory-access components for the circuit, connecting each of them to the relevant arbitrator or LSQ.
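
    The kind of static check such array-access-pattern analyses build on can be illustrated with the classic GCD dependence test for affine indices. The Python sketch below is a deliberate simplification (a single loop index, accesses modeled as stride*i + offset, and a hypothetical may_depend helper); it is not the analysis implemented in the EC compiler:

    from math import gcd

    def may_depend(acc1, acc2):
        # Each access is modeled as (base_array, stride, offset), i.e. base[stride*i + offset].
        base1, a1, b1 = acc1
        base2, a2, b2 = acc2
        if base1 != base2:          # assumes pointer analysis already proved the bases distinct
            return False
        if a1 == 0 and a2 == 0:     # both indices are constants
            return b1 == b2
        if (b2 - b1) % gcd(abs(a1), abs(a2)) != 0:
            return False            # GCD test: no integer solution, provably independent
        return True                 # otherwise conservatively assume a dependence

    # A[2*i] vs A[2*i + 1] can never overlap; A[2*i] vs A[4*i + 2] might (e.g. index 2).
    print(may_depend(("A", 2, 0), ("A", 2, 1)))   # False
    print(may_depend(("A", 2, 0), ("A", 4, 2)))   # True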

    Author Guidelines for MSS Symposium Proceedings

    The abstract is to be in fully-justified italicized text, at the top of the left-hand column, below the author and affiliation information. Use the word “Abstract” as the title, in 12-point Times, boldface type, centered relative to the column, initially capitalized. The abstract is to be in 10-point, single-spaced type. The abstract may be up to 3 inches (7.62 cm) long. Leave two blank lines after the Abstract, then begin the main text.

    Virtualized execution runtime for FPGA accelerators in the cloud

    FPGAs offer high performance coupled with energy efficiency, making them extremely attractive computational resources within a cloud ecosystem. However, to achieve this integration and make them easy to program, we first need to enable users with varying expertise to easily develop cloud applications that leverage FPGAs. With the growing size of FPGAs, allocating them monolithically to users can be wasteful due to potentially low device utilization. Hence, we also need to be able to dynamically share FPGAs among multiple users. To address these concerns, we propose a methodology and a runtime system that together simplify the FPGA application development process by providing: 1) a clean abstraction with high-level APIs for easy application development; 2) a simple execution model that supports both hardware and software execution; and 3) a shared-memory model that is convenient for programmers to use. Akin to an operating system on a computer, our lightweight runtime system enables the simultaneous execution of multiple applications by virtualizing computational resources, i.e., FPGA resources and on-board memory, and offers protection facilities to isolate applications from each other. In this paper, we illustrate how these features can be developed in a lightweight manner and quantitatively evaluate the performance overhead they introduce on a small set of applications running on our proof-of-concept prototype. Our results demonstrate that these features introduce only marginal performance overheads. More importantly, by sharing resources for simultaneous execution of multiple user applications, our platform improves FPGA utilization and delivers higher aggregate throughput compared to accessing the device in a time-shared manner.
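
    To make the described programming model concrete, the Python sketch below imagines what an application written against such a runtime could look like. Every name here (Runtime, alloc_shared, submit, VecAddKernel) is hypothetical rather than taken from the paper, and the hardware path is stubbed out by the software fallback that the execution model also supports:

    # Hypothetical usage sketch of an FPGA-virtualizing runtime: shared buffers
    # plus a submit() call that may run a task in hardware or in software.
    class Runtime:
        def alloc_shared(self, nbytes):
            # Stand-in for a shared-memory buffer visible to both host and FPGA.
            return bytearray(nbytes)

        def submit(self, kernel, *buffers):
            # A real runtime would dispatch to a free hardware slot when possible;
            # this sketch always falls back to the kernel's software implementation.
            return kernel.run_software(*buffers)

    class VecAddKernel:
        def run_software(self, a, b, out):
            for i in range(len(out)):
                out[i] = (a[i] + b[i]) & 0xFF
            return out

    rt = Runtime()
    a, b, out = rt.alloc_shared(4), rt.alloc_shared(4), rt.alloc_shared(4)
    a[:], b[:] = b"\x01\x02\x03\x04", b"\x10\x20\x30\x40"
    print(list(rt.submit(VecAddKernel(), a, b, out)))   # -> [17, 34, 51, 68]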

    A Predictable Communication Scheme for Embedded Multiprocessor Systems

    Networks-on-Chip are emerging as a widely accepted alternative to traditional bus architectures. However, they remain far from intuitive for system designers to apply because of their lack of predictability. This communication predictability can be obtained statically or dynamically. Dynamic allocation is more suitable for flexible multiprocessor systems and requires the implementation of a Quality-of-Service (QoS) mechanism. This paper explores the main QoS schemes suitable for such systems: connection-oriented and connectionless. The simulation results show that the connectionless scheme provides better predictability in terms of message latency with an acceptable buffer requirement. This work provides the designer with valuable guidelines for choosing the QoS parameters a priori with confidence in the predicted results.
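
    As a minimal sketch of the connectionless idea (assuming a priority-based arbiter at a single router output port, not the exact scheme simulated in the paper): higher-priority flits are always forwarded first, which is what makes their worst-case latency easier to bound than with plain round-robin arbitration.

    # Sketch: priority-based output-port arbitration for connectionless QoS.
    import heapq

    class OutputPort:
        def __init__(self):
            self._queue = []                 # entries: (priority, arrival_order, flit)
            self._order = 0

        def request(self, priority, flit):   # lower number = higher priority
            heapq.heappush(self._queue, (priority, self._order, flit))
            self._order += 1

        def grant(self):                     # forward the highest-priority waiting flit
            return heapq.heappop(self._queue)[2] if self._queue else None

    port = OutputPort()
    port.request(2, "best-effort flit")
    port.request(0, "control flit")
    print(port.grant())                      # -> control flit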